Category: Geeks r Us
Those of you who enjoy Facebook might find this interesting and fun.
I don't expect it to work perfectly, but it will give us an idea of what the picture is.
http://www.cnet.com/news/facebook-automated-alt-text-blind-photo-description-ai/
hmm, thanks for this! :)
Apparently Twitter has tried to do something similar (though in my view less universally effective) by allowing users to alt-tag their pictures. However, it's not enabled by default, and users have to remember to do it. While I suspect some of Facebook's descriptions will be random at first, they'll probably improve.
You can add a text description on that one from what I understand.
The user has to do it, so you're right, it isn't universal.
I experienced this last night.
It needs to improve, but it's an extremely impressive start. :)
Did it tell you,
"Oh my God, that baby's ugly!"
If so, yep, needs improvement.
Laughing.
I've never gotten it to do it. I went to the address given, but it was so riddled with ads that after 5 minutes of page refreshes and third-party ads, I gave up and closed the window.
Wait, is this something that has to be enabled before it can be utilized? I thought Facebook would automatically turn on the feature. In any case, this has been in development for a while now. Glad to finally see its implementation.
If you update your Facebook app, you should have it. If not, restart your phone, and it should appear. I had to do this.
If they're using pattern-recognition technology like the apps we've seen before (Omoby, Google Goggles, etc.), it will only improve as the database improves.
My only question is how they improve. Who provides the initial descriptions to match generalized patterns? If users can't rate the descriptions, that would hamper things as well.
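To make the rating idea concrete, here's a toy sketch, entirely invented and nothing to do with Facebook's actual code, of how user confirmations and rejections could feed a per-tag confidence score:

```python
# A minimal sketch of how user ratings could feed back into a tag
# database. Every name here is hypothetical; Facebook hasn't said
# how (or whether) it collects feedback on its alt-text tags.

from collections import defaultdict

class TagFeedbackStore:
    """Tracks how often users confirm or reject each predicted tag."""

    def __init__(self):
        # tag -> [confirmations, rejections]
        self.votes = defaultdict(lambda: [0, 0])

    def rate(self, tag, correct):
        """Record one user's verdict on a predicted tag."""
        self.votes[tag][0 if correct else 1] += 1

    def confidence(self, tag):
        """Fraction of raters who confirmed the tag (0.5 if unrated)."""
        yes, no = self.votes[tag]
        total = yes + no
        return yes / total if total else 0.5

store = TagFeedbackStore()
store.rate("grass", correct=True)
store.rate("grass", correct=True)
store.rate("indoor", correct=False)
print(store.confidence("grass"))   # 1.0
print(store.confidence("indoor"))  # 0.0
```

Without some rating signal like that, the system can only improve by adding more hand-tagged images to the database.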
A week ago I didn't see this on my phone, which is mostly the only place I use Facebook. But I'll have to check; I am curious.
Lol Wayne, that would be amusing! :)
For some, yes; for others... hahaha.
I saw it last night on my phone. It's pretty good.
I think what we'll have is the threefold problem I saw described once.
There are those of us who have been blind all our lives, from before the Internet; those who have lost their sight; and those who have been raised online.
I am a member of the first group, so any advantage we get is a great advantage to me. The second group will suffer as usual: having actually seen before, they will not likely see this as I do, though some will. It is not as good as when they were able to actually see the pictures.
And the third group has greater expectations due to having been raised online.
Well, with me being from the first group and Wayne from the second, we both seem to feel good about it. I'm sure it will improve once the database for pattern recognition grows.
However, consider the following:
I saw a post from a former shipmate of mine from my time in the Coast Guard. He posted a photo saying, "Nice day at the base." The description said "May contain two or more people, grass and trees." Because I know that base, I know where the grass and trees are, and so could imagine them there on that spot, especially since there is only one place with grass and trees on the base.
So, I could tell they'd taken the photo outside and where outside it was.
Nice.
I just saw this yesterday. Pretty impressive! :) I tested it out by posting 2 or 3 pictures from my trip! :) It nailed it! :D
Yes, mine finally worked as well.
I find it interesting, but I'm not sure it is all that helpful.
Probably because I'm not a Facebook heavy user though.
I'm extremely underwhelmed by this feature. A picture of a dog was labeled as, "may contain indoor." It calls pictures indoor a lot. Other times it says "may contain text," but makes no attempt to read the text. And, when it sees a person in the picture, why doesn't it try to match them to the people in your friends list and tell you who it is? It leaves almost everything to the imagination and I don't seem to have one.
This link is to a described video of an app being developed by Microsoft to describe actual scenes. See why I'm unimpressed with Facebook?
Unless they have people doing it on request, I don't think the tech will ever get as good as the human eye for describing.
This is just an attempt to give you some sense of what people are talking about.
One of mine described a baseball game, and that it had two people smiling.
That gave me an idea of what it was about, because I was at the game.
Just interesting, but I'd agree, it's not all that.
Smile.
I think it works best when you read the text of the post that goes with the picture. Obviously, in my example, grass, trees and a river wouldn't have had the same impact on me as they did in conjunction with who it came from, where they said they were, and them saying they missed it.
Someone is talking about their dog and the picture says indoor; well, that lets you know the dog is inside the house, which may or may not be important to you depending on who you are, where they are, and who they are to you.
It's more one aspect of an entire amalgamation of different parts.
This is actually true for sighted people, from what I understand. A not-so-well-done pic of the dog, but showing the indoor setting, plus any comments, would let the viewer have more information.
I think as blind people we tend to think of photos as these incredibly beautiful pieces of artwork, clearly demonstrating what the person wants to show -- something akin to a painting. At least I did. But I was also raised in a family that was extremely dedicated to photography and things visual. However, this really isn't the case.
I remember one Thanksgiving I had just taken the turkey out of the oven, done the final basting, and transferred it to the serving platter to cool before carving. Now if you've been in the kitchen long enough, you know it's pretty common to get a turkey that doesn't look like the image off a magazine cover. A wing or a leg has come off during transfer, maybe some skin is ripped a bit where you inserted the serving forks to help lift it out of the roaster. This is not an image that anyone in my growing up days would have ever taken a picture of.
Well, in this case I'd tried a different means of stuffing it, which caused things to tenderize such that a hindquarter was nearly off. It sat fine on the platter, no problem when it came to serving it. But my nieces immediately wanted to take shots of it to send to their friends, telling them where they were. Obviously, with my preconceived bias about pictures usually being "picture perfect," I was pretty beside myself. But this is actually true of pics online more often than not. Maybe it's just that it's easier, I don't know. I actually don't know how much less-than-perfect art got done in the film-developing days. But at least online, even the pictures themselves don't convey the exact imagery I think some of us lifers probably imagine they do.
Read the comments on some of the pictures posted to your Facebook. What will probably surprise you is that people have questions about what is in the picture. Who is it? Where are you? Is that really your house?
If humans make mistakes or can't decipher an image, I imagine AI has similar challenges.
I do like the idea of doing a pattern match against the people in your friends list. However, that assumes two things: first, that your friend has an actual picture of themselves as a profile photo. A surprising number do not, for various reasons.
Second, that the image they do have matches the part of the image being processed closely enough that the AI can determine it. If the person really cares, perhaps their profile photo is extremely good-looking, maybe even photoshopped. And in the image being processed, they might make up a quarter or less of the image's real estate, lounging in a chair in shorts and a T-shirt with a beer in hand. And if the face is partially obscured, the AI's facial recognition is not likely to get it.
Now, if you could take this same AI and transport it back to the 70s or 80s, when pictures consisted of groups of people standing around for half an hour repositioning themselves by mere sixteenths of an inch to get the perfect shot, it could probably deconstruct a photo full of people into composite subimages with mostly one face per image. I'm not sure that's how contemporary Facebook AI works, but that's basically how facial recognition in security systems does it. Except that security systems, like sighted people, consume imagery at a sample rate rather than as a single still image. Even when people are looking at a still shot, they're quite literally getting hundreds of samples each and amalgamating the image parts in their brains.
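For the curious, here's a purely hypothetical sketch of the friends-list matching idea: compare an embedding of each face found in the photo against an embedding of each friend's profile photo, and refuse to guess when nothing clears a similarity threshold. The toy vectors below stand in for what a real face-detection and embedding library would produce.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def match_faces(face_embeddings, friend_embeddings, threshold=0.8):
    """Best friend match per face, or None when nobody clears the
    threshold (small or partially obscured faces usually won't)."""
    matches = []
    for face in face_embeddings:
        best_name, best_score = None, threshold
        for name, profile in friend_embeddings.items():
            score = cosine(face, profile)
            if score > best_score:
                best_name, best_score = name, score
        matches.append(best_name)
    return matches

# Toy three-number "embeddings"; real ones run to hundreds of dimensions.
friends = {"Jane": [0.9, 0.1, 0.2], "Bob": [0.1, 0.9, 0.3]}
faces = [[0.88, 0.12, 0.21], [0.5, 0.5, 0.5]]
print(match_faces(faces, friends))  # ['Jane', None]
```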
Both human and artificial intelligences obviously have a threshold of "being wrong" when it comes to recognition. And here is one way human and artificial intelligences differ. Well, two ways. Human intelligences can't always see that they're wrong, because of their subjective bias -- what makes us human. But human intelligences can dynamically correct a wrong assertion about what they saw based on organic feedback.
Artificial intelligences may not have a subjective bias getting in the way of recognizing who is whom. But the problem is, and always has been, how to give it feedback. The most efficient artificial learning technology isn't any use without a means for it to know when it's wrong, how it's wrong and why it's wrong.
With more pragmatic environments like the stock market, aerospace guidance systems, military technologies and the like, feedback is delivered in measurable quantities and thresholds. For the visual arts, I honestly don't know. This is a part of computing I've never worked in, though I've read some about it. It requires that the intelligence be capable of interacting with someone who can give it feedback on the accuracy of its results. And if this were military-grade, the AI would know enough to ask questions of the critic to eliminate subjective bias, keeping track of how the subject typically answers questions.
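Here's a toy, runnable illustration of that feedback point. Everything in it is invented, but it shows why a bare "you were wrong" signal isn't enough: the critic has to supply the correct label for the learner to improve.

```python
class LookupModel:
    """Maps a coarse image 'signature' to a label it has seen before."""

    def __init__(self):
        self.memory = {}

    def predict(self, signature):
        return self.memory.get(signature, "unknown")

    def learn(self, signature, label):
        self.memory[signature] = label

def feedback_round(model, critic, signatures):
    """One pass: predict, ask the critic, and memorize corrections."""
    wrong = 0
    for sig in signatures:
        guess = model.predict(sig)
        truth = critic(sig)  # in practice, a human rater
        if guess != truth:
            model.learn(sig, truth)
            wrong += 1
    return wrong

model = LookupModel()
critic = {"green+textured": "grass", "brown+feathered": "turkey"}.get
sigs = ["green+textured", "brown+feathered"]
print(feedback_round(model, critic, sigs))  # 2 -- corrected on pass one
print(feedback_round(model, critic, sigs))  # 0 -- it has learned both
```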
I'm not sure Facebook is really going to deploy such a system.
My guess for how it works? Probably it uses libraries of tagged images and makes decisions based on pattern matches and the preponderance of those matches. Presumably it breaks the image down into composite parts. Of course, that means the image has to be crisp enough to provide the type of differentials a machine can use. A human, unlike a machine, is used to a relatively fuzzy, blurry environment compared to what an image recognizer expects. And a sighted human is wrong far more often than right; our wetware is just that inefficient. We're just really good at being wrong, and being wrong would proverbially drive an AI nuts, if it had emotions, that is.
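If that guess is right, the "may contain" phrasing could be produced by something as simple as keeping only the tags whose match scores clear a confidence threshold. A sketch, with made-up scores:

```python
def alt_text(tag_scores, threshold=0.6):
    """Build a Facebook-style 'May contain' string from tag scores."""
    kept = [tag for tag, score in
            sorted(tag_scores.items(), key=lambda kv: -kv[1])
            if score >= threshold]
    if not kept:
        return "No description available."
    if len(kept) == 1:
        return "May contain " + kept[0] + "."
    return "May contain " + ", ".join(kept[:-1]) + " and " + kept[-1] + "."

# Invented scores, loosely modeled on the base photo described earlier:
scores = {"grass": 0.92, "trees": 0.81, "two or more people": 0.77,
          "river": 0.41}
print(alt_text(scores))
# May contain grass, trees and two or more people.
```

That would also explain why vague tags like "indoor" show up so often: broad categories are easier for a matcher to be confident about than specific ones.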
Anyway, just my thoughts. Besides the experience I had with the one image, this has got me interested in how an AI is going to handle a very troublesome, touchy, unreliable and not very predictable environment -- the image.
I agree with all except one point.
Depending on the picture, you can really tell exactly what the meaning is just by looking.
It will mean something slightly different, because we have different emotions while seeing it, but everyone will get the same basics.
My daughter is good about labeling most of her pictures, but some she doesn't.
Sometimes she'll say "this is what you look like after a night of drinking."
Okay, what does that mean? Lol
Never will this tech be good enough to describe that picture.
Those of you who are underwhelmed by this feature have yet to consider that there has never been a feature like this, as far as I know -- not one that doesn't rely on a human to actually describe for you. Even TapTapSee has extremely vague descriptions for most things. Furthermore, in order for the feature to recognize and read text to you, it'd have to have a built-in OCR feature. This, in my view, is unnecessary, especially with the numerous OCR apps available out there like KNFB Reader, which would probably do a better job at decoding the text in those photos. Yes, it is a bit more time-consuming to OCR the photo to figure out what it says, but at least it will give you a more reliable rendering of the text in it. As for the rest, well, I'm in the same group as Leo, though I have admittedly not been alive for as long as he has. This was a pretty neat feature when I saw it, and I'm looking forward to future improvements, but I am not expecting it to tell me about the faces of my friends, their expressions and what bar they're at. Any of you who do expect that, well, wake up. Dreamland ain't the real world. I mean, I suppose it could conceivably maybe possibly happen, but I'm not holding out hope, and I'm grateful for what we got.
Leo's post actually made a great deal of sense and didn't make me feel as though I was being preached at. I don't play the grateful blindy very well. Nor do I often experience guilt, for that matter. Based on other technology I've seen, I don't think that the features I've suggested are entirely outside the realm of possibility, and I see no reason not to ask for them.
No reason not to ask.
If you've never been able to see, I can understand why you'd feel it should be simple to describe a picture.
You think maybe I could be told, "This is your friend Jane and she has a baby."
It is a nice day.
But, that really doesn't tell you much at all as to how the sighted view that picture.
If it told you Jane had on a blue dress, what's blue?
What is the style of that dress?
Did Jane cut her hair from the last time you touched her, or did it grow?
Maybe Jane decided to put on a pair of sunglasses for this shot.
I think it is okay, and I will have to agree, it is helpful.
If you could get your friends to label the pictures, that would help us even more.
I read things like "feeling blessed with Joyce." That tells me who's in the picture when people label them that way.
I know who posted it, and I know who Joyce is, so.
The rest just says 2 people in the picture smiling.
Okay, I've got a bit more. Lol
So, sure, it is better than what we had.
Anything to further your fantasies / interest in AI, Voyager: my pleasure. I actually think the dissection of pictures and amalgamation of the parts will provide some rather interesting AI proofs down the road. This has been the subject of barroom discussion over bottles with some people I know in the security industry who deal with secure systems using full or partial facial recognition.
I can count on your mind to fill in the rest of it for you, re: proofs of concept in artificial intelligence.
Good points Wayne and Maddog.
Always ask. If we don't ask and ask, we'll never get it. But don't "expect" and "demand." That's my philosophy. For my part, I do find it very underwhelming. But it's also in its infancy. Who knows how it will be as time goes by. As for the KNFB Reader scanning such pictures, I don't know about the rest of you here, but I've had no viable luck reliably scanning any text on an LCD screen.
The KNFB Reader doesn't describe pictures well either.
It won't tell you colors or any of the things I've talked about, so even if you could scan it, what value would that be?
Someone correct me if I'm wrong?
When I mentioned the KNFB Reader app, I meant it more for OCRing the text that is sometimes included in the pictures people post on social media, so you can know what that text says -- not as a method of describing the finer details in the pictures.
As for getting it to work on other LCD screens etc, others have had luck with it. I haven't tried it myself, but I'd have to wager that the key is proper placement. While I hate suggesting it, maybe get a sighted person to show you how to line up your phone with the screen the first one or two times before trying it on your own, and see what happens.
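For anyone who would rather OCR a saved copy of the photo on a computer than point a camera at an LCD, here's a generic sketch using the open-source Pillow and pytesseract libraries. This is a stand-in, not how KNFB Reader works internally; "photo.jpg" is a placeholder filename, and the Tesseract engine has to be installed for it to run.

```python
from PIL import Image   # pip install Pillow
import pytesseract      # pip install pytesseract (plus the Tesseract engine)

def read_text_in_photo(path):
    """Return whatever printed text the OCR engine finds in the image."""
    image = Image.open(path)
    text = pytesseract.image_to_string(image)
    return text.strip() or "No readable text found."

print(read_text_in_photo("photo.jpg"))
```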
Ah. Okay. I understand now.
TapTapSee is getting better and better all the time with its descriptions. I've even had it describe things around the item I was taking a picture of. It'll say something like, "Man in blue polo shirt next to brown computer desk."
People are behind it, though.
I've seen so many blind people scream and scream for a feature like this, then turn right around and complain that it isn't yet perfect. Patience won't kill these people but you'd never know it for all their grumbling. Glad to see so many of you are giving it a chance.
You don't "give it a chance." It is given. Hahaha.
It is OK, but I don't think it will ever get to perfect.
You'll need some glasses for that. Lol
This comes from Flying Blind, for you Android users.
If you have Facebook for Android, version 74 or higher, you will now benefit from the automatic reading of picture descriptions:
http://blindbargains.com/bargains.php?m=15196
Anyone who uses it, chime in and tell us what you think.